Language Transfer of Audio Word2Vec: Learning Audio Segment Representations without Target Language Data

Authors

Chia-Hao Shen

Janet Y. Sung

Hung-yi Lee

Abstract

Audio Word2Vec offers fixed-dimensional vector representations of variable-length audio segments using a Sequence-to-sequence Autoencoder (SA). These vector representations have been shown to describe the sequential phonetic structures of the audio segments to a good degree, with real-world applications such as query-by-example Spoken Term Detection (STD). This paper examines the language-transfer capability of Audio Word2Vec. We train the SA on one language (the source language) and use it to extract vector representations of audio segments in another language (the target language). We find that the SA can still capture the phonetic structure of target-language audio segments if the source and target languages are similar. In query-by-example STD, the vector representations obtained from an SA trained on a large amount of source-language data surpass those from a naive encoder and from an SA trained directly on a small amount of target-language data. The results show that it is possible to learn an Audio Word2Vec model from high-resource languages and use it on low-resource languages, which further expands the usability of Audio Word2Vec.
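As a rough illustration of how fixed-dimensional segment vectors could be used for query-by-example STD, the sketch below ranks candidate audio segments by their similarity to a spoken query. It is a minimal sketch, not the authors' implementation: the vectors are hand-written placeholders standing in for encoder outputs, the names `cosine_similarity` and `rank_segments` are hypothetical, and the use of cosine similarity as the scoring function is an assumption made here for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal dimensionality."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_segments(
    query_vec: Sequence[float],
    segment_vecs: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Return (segment_id, score) pairs, best match first."""
    scored = [
        (seg_id, cosine_similarity(query_vec, vec))
        for seg_id, vec in segment_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder vectors; in practice these would come from an encoder
# trained on source-language audio and applied to target-language audio.
query = [1.0, 0.0, 1.0]
segments = {
    "seg_a": [0.9, 0.1, 1.1],
    "seg_b": [-1.0, 0.5, 0.0],
    "seg_c": [0.0, 1.0, 0.0],
}
ranking = rank_segments(query, segments)
for seg_id, score in ranking:
    print(f"{seg_id}: {score:.3f}")
```

Because every segment is reduced to one vector regardless of its duration, the comparison is a single vector operation per candidate rather than a frame-by-frame alignment.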


Related resources

Comparing the Impact of Audio-Visual Input Enhancement on Collocation Learning in Traditional and Mobile Learning Contexts

This study investigated the impact of audio-visual input enhancement teaching techniques on improving English as a Foreign Language (EFL) learners' collocation learning as well as their accuracy concerning collocation use in narrative writing. In addition, it compared the impact and efficiency of audio-visual input enhancement in two learning contexts, namely traditional and mo...


Audio Word2Vec: Unsupervised Learning of Audio Segment Representations Using Sequence-to-Sequence Autoencoder

The vector representations of fixed dimensionality for words (in text) offered by Word2Vec have been shown to be very useful in many application scenarios, in particular due to the semantic information they carry. This paper proposes a parallel version, the Audio Word2Vec. It offers the vector representations of fixed dimensionality for variable-length audio segments. These vector representatio...


The Effect of Gloss Type and Mode on Iranian EFL Learners’ Vocabulary Acquisition

Vocabulary is an important component of language proficiency which provides the basis for learners’ performance in other skills. But, since vocabulary learning seems to be so demanding, learners tend to forget newly-learnt words quite soon. In order to identify vocabulary learning conditions which can produce a more lasting effect, this study investigated the effect of three kinds of gloss cond...


The Efficacy of Audio Input Flooding Tasks on Learning Grammar: Uptake of Present Tense

This study sought to probe the role of input flooding through listening tasks on the uptake of the simple present tense and the present progressive tense among pre-intermediate English as a Foreign Language (EFL) learners. To comply with the objective, an experimental design was adopted. 55 pre-intermediate learners participated in the study. They were randomly divided into one control group, ...


The Effect of Pre-teaching New Vocabulary Items via Audio-Visuals on Iranian EFL Learners’ Reading Comprehension Ability

This study aimed to investigate the effect of pre-teaching new vocabulary items via audio-visuals on Iranian EFL learners’ reading comprehension ability. The question this study tried to answer is whether pre-teaching new vocabulary items via audio-visuals has any effect on Iranian EFL learners’ reading comprehension ability. To find the answer to the question, 30 intermediate level stu...



Journal:

CoRR

Volume abs/1707.06519

Publication date: 2017
